Name: JENNIFER AGU
Student_ID: 8882641
Introduction
Framing the Problem
This project's objective is to develop and evaluate a deep learning model that classifies images into distinct categories. The model is trained on a dataset of labeled images that must be preprocessed before being fed into a neural network. Our goal is to investigate different methods to maximize the model's performance: training a custom CNN from scratch and transfer learning (using a pre-trained VGG16 model). By comparing the performance of these two models, we can evaluate the benefits of transfer learning over creating a model from scratch, especially for image classification tasks.
Import the Required Python Libraries
import os, shutil, pathlib
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix, precision_recall_curve
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import image_dataset_from_directory
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.applications import VGG16
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from collections import Counter
1. Obtain the Data: Get the Dogs vs Cats dataset
# This should point to the small dataset of the Kaggle Dogs vs Cats competition that was created in a previous notebook
data_folder = pathlib.Path('../data/kaggle_dogs_vs_cats_small')

# Load the train, validation and test datasets from their respective subfolders,
# resizing each image to 180 x 180 pixels and batching them in groups of 32
train_dataset = image_dataset_from_directory(data_folder / "train", image_size=(180, 180), batch_size=32)
validation_dataset = image_dataset_from_directory(data_folder / "validation", image_size=(180, 180), batch_size=32)
test_dataset = image_dataset_from_directory(data_folder / "test", image_size=(180, 180), batch_size=32)
Found 2000 files belonging to 2 classes.
Found 1000 files belonging to 2 classes.
Found 2000 files belonging to 2 classes.
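The "63/63" step counts that appear later in the training logs follow directly from these file counts: steps per epoch is the number of files divided by the batch size, rounded up. A quick standalone check (plain Python, independent of the notebook's datasets):

```python
import math

batch_size = 32
# File counts reported by image_dataset_from_directory above
counts = {"train": 2000, "validation": 1000, "test": 2000}

# Steps per epoch = ceil(files / batch_size); the last batch may be partial
steps = {name: math.ceil(n / batch_size) for name, n in counts.items()}
print(steps)  # {'train': 63, 'validation': 32, 'test': 63}
```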
2. EDA: Explore the data with relevant graphs, statistics and insights
- Explore the data and label batch shapes for the train, validation and test datasets.
print('Train Data:')
for data_batch, labels_batch in train_dataset:
    print("data batch shape for the train data:", data_batch.shape)
    print("labels batch shape for the train data:", labels_batch.shape)
    break

print('\nValidation Data:')
for data_batch, labels_batch in validation_dataset:
    print("data batch shape for the Validation Data:", data_batch.shape)
    print("labels batch shape for the Validation Data:", labels_batch.shape)
    break

print('\nTest Data:')
for data_batch, labels_batch in test_dataset:
    print("data batch shape for the test Data:", data_batch.shape)
    print("labels batch shape for the test Data:", labels_batch.shape)
    break
Train Data:
data batch shape for the train data: (32, 180, 180, 3)
labels batch shape for the train data: (32,)

Validation Data:
data batch shape for the Validation Data: (32, 180, 180, 3)
labels batch shape for the Validation Data: (32,)

Test Data:
data batch shape for the test Data: (32, 180, 180, 3)
labels batch shape for the test Data: (32,)
Explain: This shows that the train, validation and test datasets each have a data batch shape of (32, 180, 180, 3): every batch holds 32 images of 180 x 180 pixels with 3 channels, the Red, Green and Blue (RGB) colour channels. The labels shape of (32,) gives one label per image in the batch, 0 or 1 depending on whether it is a cat or a dog.
- Explore the images in the train dataset
# Define a function to plot sample images from the dataset
def plot_sample_images(dataset, class_names):
    plt.figure(figsize=(10, 10))
    # Take a single batch from the dataset
    for images, labels in dataset.take(1):
        # Loop through the first 9 images in the batch
        for i in range(9):
            ax = plt.subplot(3, 3, i + 1)
            plt.imshow(images[i].numpy().astype("uint8"))  # Display the image after converting it to the right format
            plt.title(f"Label: {class_names[int(labels[i])]}")  # Set the title of the plot using the class name
            plt.axis("off")  # Hide the axis
    plt.tight_layout()
    plt.show()  # Display the plot

class_names = train_dataset.class_names  # Get the class names from the train dataset
plot_sample_images(train_dataset, class_names)  # Plot sample images from the train dataset
Explain: As we can see, the images vary in pose, background, lighting, orientation and quality across both classes (dogs and cats). This variation helps improve the model's ability to generalize to real-world situations.
- Explore the distribution of classes (cats and dogs) in the training dataset
# Define a function to count the number of samples in each class
def get_label_distribution(dataset):
    label_counts = Counter()
    for _, labels in dataset.unbatch():
        label = int(labels.numpy())
        label_counts[label] += 1
    return label_counts
# Apply the function on the train dataset
label_dist = get_label_distribution(train_dataset)
# Map the label indices to their labels
label_names = [class_names[k] for k in label_dist.keys()]
label_values = list(label_dist.values())
# Plot the bar chart
plt.bar(label_names, label_values, color='skyblue')
plt.title("Class Distribution")
plt.xlabel("Class")
plt.ylabel("Number of Images")
plt.show()
# Display the count of each label
print(f"Label counts: {dict(zip(label_names, label_values))}")
Label counts: {'dog': 1000, 'cat': 1000}
Explain: We can see from this bar chart that the training dataset is balanced between the classes (cats and dogs), which is ideal for training a model because it reduces the risk of the model becoming biased toward one class.
3. Train two networks
3.1 Define a Neural Network of your choice
Defining the model
Convolutional Neural Networks (CNNs) are a subset of deep learning models designed specifically to process image data. They are frequently used for tasks such as object detection, face recognition, and image classification. Here we use one to classify cats and dogs.
The model takes an input of 180 by 180 pixel images with 3 channels (RGB) and consists of:
- 5 Convolutional layers, with the number of filters increasing from 32 to 256, to extract features from the images.
- 4 MaxPooling layers with a pool_size of 2, one after most of the convolutional layers, to reduce the spatial dimensions (height and width) and the computation, helping the model focus on the most important features.
- A Flatten layer that converts the 2D feature maps to a 1D vector to prepare the data for the classification layer.
- A Dense output layer with a single neuron and a sigmoid activation for binary classification.
inputs = keras.Input(shape=(180, 180, 3))
x = layers.Rescaling(1./255)(inputs)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
cnn_model = keras.Model(inputs=inputs, outputs=outputs)
cnn_model.summary()
Model: "model_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) [(None, 180, 180, 3)] 0
rescaling_1 (Rescaling) (None, 180, 180, 3) 0
conv2d_5 (Conv2D) (None, 178, 178, 32) 896
max_pooling2d_4 (MaxPooling2D) (None, 89, 89, 32) 0
conv2d_6 (Conv2D) (None, 87, 87, 64) 18496
max_pooling2d_5 (MaxPooling2D) (None, 43, 43, 64) 0
conv2d_7 (Conv2D) (None, 41, 41, 128) 73856
max_pooling2d_6 (MaxPooling2D) (None, 20, 20, 128) 0
conv2d_8 (Conv2D) (None, 18, 18, 256) 295168
max_pooling2d_7 (MaxPooling2D) (None, 9, 9, 256) 0
conv2d_9 (Conv2D) (None, 7, 7, 256) 590080
flatten_2 (Flatten) (None, 12544) 0
dense_3 (Dense) (None, 1) 12545
=================================================================
Total params: 991,041
Trainable params: 991,041
Non-trainable params: 0
_________________________________________________________________
Summary: From this we can see that the model has 991,041 parameters in total, all of which are trainable.
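Each Conv2D count in the summary follows from the formula kernel_h x kernel_w x in_channels x out_channels weights plus one bias per filter; the Dense layer contributes one weight per flattened unit plus a bias. A standalone sketch reproducing the 991,041 total:

```python
def conv2d_params(kernel, c_in, c_out):
    # kernel*kernel*c_in weights per filter, plus one bias per filter
    return kernel * kernel * c_in * c_out + c_out

# (kernel, in_channels, out_channels) for the five conv layers above
convs = [(3, 3, 32), (3, 32, 64), (3, 64, 128), (3, 128, 256), (3, 256, 256)]
conv_total = sum(conv2d_params(k, ci, co) for k, ci, co in convs)

dense_total = 12544 * 1 + 1  # Flatten output (7*7*256 = 12544) to one sigmoid unit
print(conv_total + dense_total)  # 991041, matching the summary
```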
We will compile the CNN model with the binary cross-entropy loss, the RMSprop optimizer, and accuracy as the evaluation metric. We will also use a ModelCheckpoint callback to save the best model based on the validation loss, then train the model for 30 epochs on the training and validation datasets.
We will save the best-performing version to the ./models/cnn_from_scratch.keras path and store the training history to visualize and analyze later.
# Compile the CNN model
cnn_model.compile(loss="binary_crossentropy",
                  optimizer="rmsprop",
                  metrics=["accuracy"])

# Define callbacks to enhance the training of the model
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="./models/cnn_from_scratch.keras",
        save_best_only=True,
        monitor="val_loss")
]

# Train the CNN Model
history = cnn_model.fit(
    train_dataset,
    epochs=30,
    validation_data=validation_dataset,
    callbacks=callbacks)
Epoch 1/30 - 69s - loss: 0.7048 - accuracy: 0.5255 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 2/30 - 62s - loss: 0.6968 - accuracy: 0.5280 - val_loss: 0.6868 - val_accuracy: 0.5070
Epoch 3/30 - 59s - loss: 0.6901 - accuracy: 0.5655 - val_loss: 0.6700 - val_accuracy: 0.6040
Epoch 4/30 - 57s - loss: 0.6561 - accuracy: 0.6275 - val_loss: 0.6727 - val_accuracy: 0.5860
Epoch 5/30 - 59s - loss: 0.6399 - accuracy: 0.6395 - val_loss: 0.6365 - val_accuracy: 0.6340
Epoch 6/30 - 59s - loss: 0.5953 - accuracy: 0.6855 - val_loss: 0.5945 - val_accuracy: 0.6680
Epoch 7/30 - 75s - loss: 0.5619 - accuracy: 0.7095 - val_loss: 0.6778 - val_accuracy: 0.6510
Epoch 8/30 - 66s - loss: 0.5324 - accuracy: 0.7350 - val_loss: 0.5754 - val_accuracy: 0.7180
Epoch 9/30 - 63s - loss: 0.4820 - accuracy: 0.7565 - val_loss: 0.6545 - val_accuracy: 0.6660
Epoch 10/30 - 60s - loss: 0.4356 - accuracy: 0.8035 - val_loss: 0.6981 - val_accuracy: 0.6490
Epoch 11/30 - 61s - loss: 0.4028 - accuracy: 0.8220 - val_loss: 0.5764 - val_accuracy: 0.7200
Epoch 12/30 - 60s - loss: 0.3468 - accuracy: 0.8570 - val_loss: 0.5783 - val_accuracy: 0.7220
Epoch 13/30 - 63s - loss: 0.2905 - accuracy: 0.8695 - val_loss: 0.6532 - val_accuracy: 0.7380
Epoch 14/30 - 62s - loss: 0.2288 - accuracy: 0.9050 - val_loss: 0.9340 - val_accuracy: 0.6930
Epoch 15/30 - 62s - loss: 0.1818 - accuracy: 0.9300 - val_loss: 0.9695 - val_accuracy: 0.7040
Epoch 16/30 - 60s - loss: 0.1400 - accuracy: 0.9465 - val_loss: 1.1002 - val_accuracy: 0.7060
Epoch 17/30 - 64s - loss: 0.1140 - accuracy: 0.9630 - val_loss: 1.1496 - val_accuracy: 0.7250
Epoch 18/30 - 63s - loss: 0.0981 - accuracy: 0.9645 - val_loss: 1.2450 - val_accuracy: 0.7340
Epoch 19/30 - 68s - loss: 0.0686 - accuracy: 0.9745 - val_loss: 1.6035 - val_accuracy: 0.6840
Epoch 20/30 - 69s - loss: 0.0748 - accuracy: 0.9710 - val_loss: 1.5866 - val_accuracy: 0.7120
Epoch 21/30 - 77s - loss: 0.0627 - accuracy: 0.9775 - val_loss: 1.3751 - val_accuracy: 0.7320
Epoch 22/30 - 69s - loss: 0.0881 - accuracy: 0.9745 - val_loss: 1.4056 - val_accuracy: 0.7280
Epoch 23/30 - 64s - loss: 0.0322 - accuracy: 0.9900 - val_loss: 1.8468 - val_accuracy: 0.7180
Epoch 24/30 - 65s - loss: 0.0786 - accuracy: 0.9765 - val_loss: 1.5832 - val_accuracy: 0.7320
Epoch 25/30 - 65s - loss: 0.0451 - accuracy: 0.9885 - val_loss: 1.9540 - val_accuracy: 0.7040
Epoch 26/30 - 67s - loss: 0.0529 - accuracy: 0.9810 - val_loss: 1.8376 - val_accuracy: 0.7160
Epoch 27/30 - 65s - loss: 0.0391 - accuracy: 0.9845 - val_loss: 1.9217 - val_accuracy: 0.7230
Epoch 28/30 - 72s - loss: 0.0561 - accuracy: 0.9815 - val_loss: 1.8028 - val_accuracy: 0.7290
Epoch 29/30 - 91s - loss: 0.0289 - accuracy: 0.9930 - val_loss: 1.9423 - val_accuracy: 0.7300
Epoch 30/30 - 82s - loss: 0.0220 - accuracy: 0.9930 - val_loss: 1.9190 - val_accuracy: 0.7350
Explain: Observing the results:
- The training accuracy starts at 0.5255 (approximately 53%) and rises to 0.9930 by epoch 30.
- The validation accuracy peaks at epoch 13 with a value of 0.7380.
- The training loss decreases steadily throughout.
- The validation loss is lowest at epoch 8 with a value of 0.5754.
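These best epochs can be read off the history object programmatically rather than by eye. A standalone sketch using a few illustrative val_loss values copied from the log above (in the notebook the full list would be history.history["val_loss"]):

```python
# First nine validation losses from the training log above
val_loss = [0.6932, 0.6868, 0.6700, 0.6727, 0.6365, 0.5945, 0.6778, 0.5754, 0.6545]

# Epochs are 1-indexed, so shift the argmin by one
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(best_epoch, val_loss[best_epoch - 1])  # 8 0.5754
```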
# Extract the metrics from the history
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)
# figure size for the subplots
plt.figure(figsize=(12, 5))
# Plot the Accuracy
plt.subplot(1, 2, 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()
# Plot the loss
plt.subplot(1, 2, 2)
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.tight_layout()
plt.show()
Explain: Observing these graphs, overfitting is evident: the training metrics keep improving while the validation loss reaches its lowest point around epoch 8 and rises thereafter.
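Since the model keeps training long after the validation loss bottoms out, a common remedy is early stopping: halt once val_loss has not improved for a set number of epochs (Keras's EarlyStopping callback, imported at the top but unused here, implements this). The core patience logic can be sketched in plain Python:

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the 1-indexed epoch at which training would halt:
    after `patience` consecutive epochs without a new best val_loss."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(val_losses)  # never triggered; train to the end

# Illustrative values echoing the run above: best at epoch 5, then worsening
losses = [0.69, 0.68, 0.67, 0.64, 0.58, 0.65, 0.70, 0.93, 0.97]
print(early_stop_epoch(losses))  # halts at epoch 8
```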
3.2 Fine-Tune VGG16 (pre-trained on ImageNet)
- We load the pretrained VGG16 model with ImageNet weights and exclude the top layers, so that the convolutional base can be used as a feature extractor for our dataset.
# Load the VGG16 model without the top layer and with pretrained ImageNet weights
conv_base = keras.applications.vgg16.VGG16(
    weights="imagenet",
    include_top=False
)
# Print model summary
conv_base.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) [(None, None, None, 3)] 0
block1_conv1 (Conv2D) (None, None, None, 64) 1792
block1_conv2 (Conv2D) (None, None, None, 64) 36928
block1_pool (MaxPooling2D) (None, None, None, 64) 0
block2_conv1 (Conv2D) (None, None, None, 128) 73856
block2_conv2 (Conv2D) (None, None, None, 128) 147584
block2_pool (MaxPooling2D) (None, None, None, 128) 0
block3_conv1 (Conv2D) (None, None, None, 256) 295168
block3_conv2 (Conv2D) (None, None, None, 256) 590080
block3_conv3 (Conv2D) (None, None, None, 256) 590080
block3_pool (MaxPooling2D) (None, None, None, 256) 0
block4_conv1 (Conv2D) (None, None, None, 512) 1180160
block4_conv2 (Conv2D) (None, None, None, 512) 2359808
block4_conv3 (Conv2D) (None, None, None, 512) 2359808
block4_pool (MaxPooling2D) (None, None, None, 512) 0
block5_conv1 (Conv2D) (None, None, None, 512) 2359808
block5_conv2 (Conv2D) (None, None, None, 512) 2359808
block5_conv3 (Conv2D) (None, None, None, 512) 2359808
block5_pool (MaxPooling2D) (None, None, None, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
Summary: From this we can see that the convolutional base has 14,714,688 parameters in total, all of which are (at this point) trainable.
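The 14,714,688 total can be reproduced with the Conv2D parameter formula (kernel_h x kernel_w x in_channels x out_channels weights plus one bias per filter), since the pooling layers contribute no weights. A standalone check over VGG16's 3 x 3 conv stack:

```python
def conv2d_params(kernel, c_in, c_out):
    # kernel*kernel*c_in weights per filter, plus one bias per filter
    return kernel * kernel * c_in * c_out + c_out

# VGG16's channel progression (all 3x3 kernels; pooling layers have no parameters)
widths = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]

total, c_in = 0, 3
for c_out in widths:
    total += conv2d_params(3, c_in, c_out)
    c_in = c_out
print(total)  # 14714688, matching the summary
```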
Explain: We will retrieve features and labels from each dataset by applying the pretrained VGG16 convolutional base. Each image batch receives VGG16-specific preprocessing, then passes through the convolutional base to produce a feature representation, while the corresponding labels are saved. The function is applied to the training, validation and test datasets to obtain the extracted features and labels.
# Define a function to extract features and labels from the dataset using the VGG16 convolutional base
def get_features_and_labels(dataset):
    all_features = []
    all_labels = []
    # Loop through each batch of images and labels in the dataset
    for images, labels in dataset:
        preprocessed_images = keras.applications.vgg16.preprocess_input(images)
        features = conv_base.predict(preprocessed_images)
        all_features.append(features)
        all_labels.append(labels)
    return np.concatenate(all_features), np.concatenate(all_labels)
# Extract features and labels for train, validation and test datasets
train_features, train_labels = get_features_and_labels(train_dataset)
val_features, val_labels = get_features_and_labels(validation_dataset)
test_features, test_labels = get_features_and_labels(test_dataset)
1/1 [==============================] - 4s 4s/step
(per-batch predict progress lines, repeated for every batch of the train, validation and test datasets, trimmed)
Defining the densely connected classifier
Summary: The weights of the pretrained VGG16 convolutional base will be frozen during training, preventing them from updating. On top of it we build a custom classifier: a Flatten layer, a Dense hidden layer with ReLU activation, a Dropout layer to reduce overfitting, and a sigmoid output layer for binary classification.
# Freeze the convolutional base
conv_base.trainable = False
# Build the model on top of the conv base
inputs = keras.Input(shape=(180, 180, 3))
x = conv_base(inputs)
x = layers.Flatten()(x)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
vgg_model = keras.Model(inputs, outputs)
# Print model summary
vgg_model.summary()
Model: "model_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_6 (InputLayer) [(None, 180, 180, 3)] 0
vgg16 (Functional) (None, None, None, 512) 14714688
flatten_3 (Flatten) (None, 12800) 0
dense_4 (Dense) (None, 256) 3277056
dropout_1 (Dropout) (None, 256) 0
dense_5 (Dense) (None, 1) 257
=================================================================
Total params: 17,992,001
Trainable params: 3,277,313
Non-trainable params: 14,714,688
_________________________________________________________________
Explain: The model has a total of 17,992,001 parameters, of which 3,277,313 are trainable (from the newly added layers) and 14,714,688 are non-trainable (from the frozen VGG16 base).
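These three numbers can be reproduced by hand. On a 180 x 180 input, VGG16's five 2 x 2 poolings (each halving the spatial size with floor division) leave a 5 x 5 x 512 feature map, which the Flatten layer turns into the 12,800 units shown above; a standalone check:

```python
# Spatial size after five 2x2 max-poolings on a 180x180 input
size = 180
for _ in range(5):
    size //= 2          # 180 -> 90 -> 45 -> 22 -> 11 -> 5
flatten_units = size * size * 512   # 5*5*512 = 12800

dense_params = flatten_units * 256 + 256  # hidden Dense layer: weights + biases
output_params = 256 * 1 + 1               # sigmoid output layer
trainable = dense_params + output_params
print(trainable)              # 3277313
print(trainable + 14714688)   # 17992001 total parameters
```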
# Compile the VGG16 model
vgg_model.compile(loss="binary_crossentropy",
                  optimizer="rmsprop",
                  metrics=["accuracy"])

# Define callbacks to enhance the training of the model
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="./models/vgg16_model.keras",
        save_best_only=True,
        monitor="val_loss")
]

# Train the VGG16 Model
history = vgg_model.fit(
    train_dataset,
    epochs=50,
    validation_data=validation_dataset,
    callbacks=callbacks)
Epoch 1/50 - 438s - loss: 7.9452 - accuracy: 0.8870 - val_loss: 1.3846 - val_accuracy: 0.9400
Epoch 2/50 - 357s - loss: 1.4545 - accuracy: 0.9475 - val_loss: 1.9854 - val_accuracy: 0.9030
Epoch 3/50 - 382s - loss: 0.5191 - accuracy: 0.9645 - val_loss: 0.6793 - val_accuracy: 0.9470
Epoch 4/50 - 400s - loss: 0.2682 - accuracy: 0.9815 - val_loss: 1.0961 - val_accuracy: 0.9340
Epoch 5/50 - 370s - loss: 0.2148 - accuracy: 0.9855 - val_loss: 0.7397 - val_accuracy: 0.9490
Epoch 6/50 - 343s - loss: 0.1540 - accuracy: 0.9845 - val_loss: 0.7253 - val_accuracy: 0.9580
Epoch 7/50 - 402s - loss: 0.1767 - accuracy: 0.9895 - val_loss: 0.8830 - val_accuracy: 0.9540
Epoch 8/50 - 401s - loss: 0.1423 - accuracy: 0.9900 - val_loss: 0.9885 - val_accuracy: 0.9570
Epoch 9/50 - 365s - loss: 0.2535 - accuracy: 0.9880 - val_loss: 0.8698 - val_accuracy: 0.9580
Epoch 10/50 - 374s - loss: 0.1115 - accuracy: 0.9925 - val_loss: 0.9336 - val_accuracy: 0.9590
Epoch 11/50 - 315s - loss: 0.1098 - accuracy: 0.9950 - val_loss: 1.0048 - val_accuracy: 0.9540
Epoch 12/50 - 392s - loss: 0.1236 - accuracy: 0.9920 - val_loss: 1.0349 - val_accuracy: 0.9530
Epoch 13/50 - 399s - loss: 0.0195 - accuracy: 0.9980 - val_loss: 1.1741 - val_accuracy: 0.9470
Epoch 14/50 - 377s - loss: 0.1208 - accuracy: 0.9945 - val_loss: 1.5136 - val_accuracy: 0.9370
Epoch 15/50 - 375s - loss: 0.0923 - accuracy: 0.9945 - val_loss: 1.3525 - val_accuracy: 0.9520
Epoch 16/50 - 378s - loss: 0.0270 - accuracy: 0.9975 - val_loss: 0.9302 - val_accuracy: 0.9600
Epoch 17/50 - 371s - loss: 0.0283 - accuracy: 0.9965 - val_loss: 1.0255 - val_accuracy: 0.9530
Epoch 18/50 - 383s - loss: 0.0317 - accuracy: 0.9950 - val_loss: 1.2045 - val_accuracy: 0.9560
Epoch 19/50 - 382s - loss: 0.1295 - accuracy: 0.9945 - val_loss: 1.6302 - val_accuracy: 0.9470
Epoch 20/50 - 377s - loss: 0.0280 - accuracy: 0.9960 - val_loss: 1.3293 - val_accuracy: 0.9540
Epoch 21/50 - 367s - loss: 0.0766 - accuracy: 0.9940 - val_loss: 0.9235 - val_accuracy: 0.9590
Epoch 22/50 - 347s - loss: 0.0347 - accuracy: 0.9990 - val_loss: 0.9677 - val_accuracy: 0.9620
Epoch 23/50 - 340s - loss: 0.0522 - accuracy: 0.9970 - val_loss: 0.9182 - val_accuracy: 0.9610
Epoch 24/50 - 338s - loss: 0.0012 - accuracy: 0.9995 - val_loss: 0.9561 - val_accuracy: 0.9600
Epoch 25/50 - 342s - loss: 0.0090 - accuracy: 0.9990 - val_loss: 1.1006 - val_accuracy: 0.9550
Epoch 26/50 - 329s - loss: 0.0346 - accuracy: 0.9970 - val_loss: 0.8778 - val_accuracy: 0.9570
Epoch 27/50 - 329s - loss: 0.0278 - accuracy: 0.9975 - val_loss: 0.9842 - val_accuracy: 0.9580
Epoch 28/50 - 328s - loss: 0.0051 - accuracy: 0.9990 - val_loss: 1.3091 - val_accuracy: 0.9520
Epoch 29/50 - 325s - loss: 0.0173 - accuracy: 0.9980 - val_loss: 1.0597 - val_accuracy: 0.9550
Epoch 30/50 - 321s - loss: 0.0132 - accuracy: 0.9985 - val_loss: 1.1199 - val_accuracy: 0.9560
Epoch 31/50 - 319s - loss: 0.0285 - accuracy: 0.9970 - val_loss: 1.2835 - val_accuracy: 0.9540
Epoch 32/50 - 318s - loss: 0.0419 - accuracy: 0.9975 - val_loss: 1.0156 - val_accuracy: 0.9480
Epoch 33/50 - 330s - loss: 0.0086 - accuracy: 0.9990 - val_loss: 0.9055 - val_accuracy: 0.9620
Epoch 34/50 - 327s - loss: 0.0146 - accuracy: 0.9990 - val_loss: 0.9424 - val_accuracy: 0.9570
Epoch 35/50 - 326s - loss: 0.0113 - accuracy: 0.9995 - val_loss: 0.8861 - val_accuracy: 0.9640
Epoch 36/50 - 339s - loss: 4.8828e-04 - accuracy: 0.9995 - val_loss: 0.9661 - val_accuracy: 0.9630
Epoch 37/50 - 328s - loss: 0.0222 - accuracy: 0.9985 - val_loss: 1.1668 - val_accuracy: 0.9520
Epoch 38/50 - 329s - loss: 0.0059 - accuracy: 0.9995 - val_loss: 1.0634 - val_accuracy: 0.9580
Epoch 39/50 - 339s - loss: 2.1523e-08 - accuracy: 1.0000 - val_loss: 1.0646 - val_accuracy: 0.9580
Epoch 40/50 - 327s - loss: 0.0215 - accuracy: 0.9980 - val_loss: 1.4480 - val_accuracy: 0.9530
Epoch 41/50 - 338s - loss: 0.0402 - accuracy: 0.9980 - val_loss: 1.4700 - val_accuracy: 0.9500
Epoch 42/50 - 342s - loss: 0.0242 - accuracy: 0.9980 - val_loss: 0.9304 - val_accuracy: 0.9560
Epoch 43/50 - 327s - loss: 0.0224 - accuracy: 0.9985 - val_loss: 1.3972 - val_accuracy: 0.9480
Epoch 44/50 - 333s - loss: 0.0068 - accuracy: 0.9985 - val_loss: 0.9288 - val_accuracy: 0.9560
Epoch 45/50 - 331s - loss: 0.0068 - accuracy: 0.9995 - val_loss: 0.9100 - val_accuracy: 0.9600
Epoch 46/50 - 337s - loss: 0.0116 - accuracy: 0.9985 - val_loss: 0.8512 - val_accuracy: 0.9620
Epoch 47/50 - 340s - loss: 0.0046 - accuracy: 0.9995 - val_loss: 1.0673 - val_accuracy: 0.9610
Epoch 48/50 - 327s - loss: 0.0736 - accuracy: 0.9975 - val_loss: 1.0948 - val_accuracy: 0.9550
Epoch 49/50 - 330s - loss: 0.0012 - accuracy: 0.9995 - val_loss: 1.0798 - val_accuracy: 0.9550
Epoch 50/50 - 329s - loss: 0.0022 - accuracy: 0.9995 - val_loss: 1.0196 - val_accuracy: 0.9550
Explain:
Observing the results:
The training accuracy starts at 0.8870 (approximately 89%) and climbs to 0.9995 (approximately 100%) by epoch 50, which suggests the model kept getting better at classifying the training dataset. Notably, at epoch 39 the training accuracy briefly reached 100%.
The validation accuracy peaked at epoch 35 with a value of 0.9640, which suggests the model performs very well on the validation dataset.
The training loss decreases overall but fluctuates from epoch to epoch rather than falling smoothly.
The validation loss was lowest at epoch 3 with a value of 0.6793 and generally rises afterwards, which indicates the model begins overfitting beyond that point.
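Rather than reading the best epochs off the log by eye, they can be extracted programmatically from the history object returned by `model.fit`. The sketch below uses a hypothetical helper (`best_epochs` is not part of Keras) run on toy values that mirror the first five epochs of the log above:

```python
import numpy as np

def best_epochs(history_dict):
    """Return (epoch with lowest val_loss, epoch with highest val_accuracy), 1-indexed."""
    val_loss = np.asarray(history_dict["val_loss"])
    val_acc = np.asarray(history_dict["val_accuracy"])
    return int(np.argmin(val_loss)) + 1, int(np.argmax(val_acc)) + 1

# Toy dict shaped like history.history, mirroring the first five logged epochs
toy = {
    "val_loss": [1.38, 1.99, 0.68, 1.10, 0.74],
    "val_accuracy": [0.940, 0.903, 0.947, 0.934, 0.949],
}
print(best_epochs(toy))  # (3, 5)
```

Over the full 50-epoch history this would report epoch 3 for the lowest validation loss and epoch 35 for the highest validation accuracy, matching the observations above.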
# Extract the metrics from the history
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)
# figure size for the subplots
plt.figure(figsize=(12, 5))
# Plot the Accuracy
plt.subplot(1, 2, 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()
# Plot the loss
plt.subplot(1, 2, 2)
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.tight_layout()
plt.show()
Explain: Observing the graphs, we can confirm that the validation loss is lowest around epoch 3, after which it trends upward while the training loss keeps falling.
4. Explore the relative performance of the models
# Convert the test dataset to NumPy arrays so the sklearn metrics below can consume it
test_images_list = []
test_labels_list = []
for images, labels in test_dataset:
    test_images_list.append(images.numpy())
    test_labels_list.append(labels.numpy())
test_images = np.concatenate(test_images_list)
test_labels = np.concatenate(test_labels_list)
# Class names inferred from the directory structure (used for plot labels below)
class_names = test_dataset.class_names
4.1 Accuracy
Accuracy for the CNN Model
test_model = keras.models.load_model("./models/convnet_from_scratch.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)
# Print the Accuracy for the CNN Model
print(f"Test accuracy: {test_acc:.3f}")
63/63 [==============================] - 21s 303ms/step - loss: 0.5636 - accuracy: 0.7220 Test accuracy: 0.722
Accuracy for the VGG16 Model
test_model = keras.models.load_model("./models/vgg16_model.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)
# Print the Accuracy for the VGG16 Model
print(f"Test accuracy: {test_acc:.3f}")
63/63 [==============================] - 196s 3s/step - loss: 0.6315 - accuracy: 0.9595 Test accuracy: 0.960
Conclusion: The VGG16 model, with a test accuracy of 96.0%, performs better than the CNN model, which reached a test accuracy of 72.2%.
4.2 Confusion matrix
Confusion matrix for the CNN Model
# Load the saved CNN model (same file evaluated in section 4.1)
cnn_model = load_model('./models/convnet_from_scratch.keras')
# Make predictions
cnn_predictions = cnn_model.predict(test_images)
cnn_predictions = (cnn_predictions > 0.5).astype("int32").flatten()
# Plot the confusion matrix
cnn_cm = confusion_matrix(test_labels, cnn_predictions)
plt.figure(figsize=(6, 4))
sns.heatmap(cnn_cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklabels=class_names)
plt.title("CNN Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
63/63 [==============================] - 17s 258ms/step
Confusion matrix for the VGG16 Model
# Load the saved model (VGG16 with feature extraction) ---
vgg_model = load_model('./models/vgg16_model.keras')
# Get predictions from the model
vgg_predictions = vgg_model.predict(test_images)
vgg_predictions = (vgg_predictions > 0.5).astype("int32").flatten()
# Confusion Matrix
vgg_cm = confusion_matrix(test_labels, vgg_predictions)
plt.figure(figsize=(6, 4))
sns.heatmap(vgg_cm, annot=True, fmt='d', cmap='Greens', xticklabels=class_names, yticklabels=class_names)
plt.title("VGG16 Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
63/63 [==============================] - 233s 4s/step
Conclusion: The confusion matrix of the CNN model shows that it misclassified 318 cats as dogs and 268 dogs as cats, while the confusion matrix of the VGG16 model shows only 31 cats misclassified as dogs and 50 dogs misclassified as cats. From these results we can say that the VGG16 model did a much better job of separating dogs from cats than the CNN model.
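As a sanity check, the error counts can be tallied directly from the confusion matrices. The sketch below rebuilds the two matrices from the misclassification numbers quoted above (the diagonal entries are assumed so that each class sums to its 1000 test images) and counts the off-diagonal, i.e. wrong, predictions:

```python
import numpy as np

# Rows = actual (cat, dog), columns = predicted (cat, dog).
# Off-diagonals are the counts quoted above; diagonals assume 1000 images per class.
cnn_cm = np.array([[682, 318],
                   [268, 732]])
vgg_cm = np.array([[969, 31],
                   [50, 950]])

def misclassified(cm):
    """Total off-diagonal entries, i.e. all wrong predictions."""
    return int(cm.sum() - np.trace(cm))

print(misclassified(cnn_cm))  # 586
print(misclassified(vgg_cm))  # 81
```

586 CNN errors out of 2000 images is consistent with the ~72% test accuracy above, and 81 VGG16 errors with the ~96% accuracy.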
4.3 Precision, Recall and F1-score
Precision, Recall and F1-score for the CNN Model
# CNN Evaluation
print("CNN Classification Report:")
print(classification_report(test_labels, cnn_predictions, target_names=class_names))
CNN Classification Report:
precision recall f1-score support
cat 0.72 0.68 0.70 1000
dog 0.70 0.73 0.71 1000
accuracy 0.71 2000
macro avg 0.71 0.71 0.71 2000
weighted avg 0.71 0.71 0.71 2000
Precision, Recall and F1-score for the VGG16 Model
print("VGG16 Classification Report:")
print(classification_report(test_labels, vgg_predictions, target_names=class_names))
VGG16 Classification Report:
precision recall f1-score support
cat 0.95 0.97 0.96 1000
dog 0.97 0.95 0.96 1000
accuracy 0.96 2000
macro avg 0.96 0.96 0.96 2000
weighted avg 0.96 0.96 0.96 2000
Conclusion
For Cats:
- Precision: the classification reports give 72% for the CNN model and 95% for the VGG16 model, which suggests the VGG16 model is far better at avoiding false "cat" predictions.
- Recall: the classification reports give 68% for the CNN model and 97% for the VGG16 model, which suggests the VGG16 model finds far more of the actual cats.
- F1-Score: the classification reports give 70% for the CNN model and 96% for the VGG16 model, which suggests the VGG16 model achieves a better balance between precision and recall, leading to better overall performance.
For Dogs:
- Precision: the classification reports give 70% for the CNN model and 97% for the VGG16 model, which suggests the VGG16 model is far better at avoiding false "dog" predictions.
- Recall: the classification reports give 73% for the CNN model and 95% for the VGG16 model, which suggests the VGG16 model finds far more of the actual dogs.
- F1-Score: the classification reports give 71% for the CNN model and 96% for the VGG16 model, which again suggests the VGG16 model achieves a better balance between precision and recall.
In conclusion, the classification reports show that the VGG16 model clearly outperforms the CNN model in classifying both cats and dogs.
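The per-class numbers in the reports follow directly from confusion-matrix counts. As a worked example, the hypothetical helper below recomputes precision, recall and F1 for the cat class of the VGG16 model, using the counts quoted earlier (969 cats correctly found, 50 dogs predicted as cat, 31 cats predicted as dog):

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from raw true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return round(precision, 2), round(recall, 2), round(f1, 2)

# "cat" as the positive class, VGG16 counts from the confusion matrix above
print(prf(tp=969, fp=50, fn=31))  # (0.95, 0.97, 0.96)
```

The result matches the cat row of the VGG16 classification report.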
# Precision-Recall Curve for CNN Model
cnn_probabilities = cnn_model.predict(test_images).flatten()
precision_cnn, recall_cnn, _ = precision_recall_curve(test_labels, cnn_probabilities)
plt.plot(recall_cnn, precision_cnn, label="CNN")
# Precision-Recall Curve for VGG16 Model
vgg_probabilities = vgg_model.predict(test_images).flatten()
precision_vgg, recall_vgg, _ = precision_recall_curve(test_labels, vgg_probabilities)
plt.plot(recall_vgg, precision_vgg, label="VGG16")
# Plot
plt.title("Precision-Recall Curve")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.legend()
plt.grid(True)
plt.show()
63/63 [==============================] - 19s 297ms/step 63/63 [==============================] - 273s 4s/step
Conclusion:
- The VGG16 model maintains high precision across nearly all recall levels, which demonstrates that it detects positives with few false alarms even as it recovers most of the positive instances.
- The CNN model's precision drops quickly as recall increases, which demonstrates a weaker ability to suppress false positives when it is pushed to find more positive cases.
- In conclusion, the CNN model struggles to keep precision high as recall increases, while the VGG16 model strikes a far better balance between precision and recall.
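To make the curve construction concrete: a precision-recall curve is built by sorting predictions by score and accumulating true and false positives at each cutoff. The NumPy sketch below is a simplified stand-in for sklearn's `precision_recall_curve`, run on hypothetical toy scores:

```python
import numpy as np

def pr_curve(y_true, scores):
    """Precision and recall at each score cutoff, most confident prediction first."""
    order = np.argsort(-scores)            # sort by descending score
    y = np.asarray(y_true)[order]
    tp = np.cumsum(y)                      # true positives accumulated at each cutoff
    fp = np.cumsum(1 - y)                  # false positives accumulated at each cutoff
    precision = tp / (tp + fp)
    recall = tp / y.sum()
    return precision, recall

# Toy example: a confident model keeps precision high as recall grows
y_true = np.array([1, 1, 0, 1, 0, 0])
scores = np.array([0.95, 0.90, 0.60, 0.55, 0.30, 0.10])
p, r = pr_curve(y_true, scores)
print(np.round(p, 2).tolist())  # [1.0, 1.0, 0.67, 0.75, 0.6, 0.5]
print(np.round(r, 2).tolist())  # [0.33, 0.67, 0.67, 1.0, 1.0, 1.0]
```

A curve that stays near precision 1.0 while recall climbs (as VGG16's does) means the model's most confident predictions are almost all correct.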
4.4 Explore specific examples in which the model failed to predict correctly
# Show wrong predictions from the CNN model
cnn_wrong_indices = np.where(cnn_predictions != test_labels)[0]
# Set up a 3x3 grid to display images
plt.figure(figsize=(10, 10))
# Display some examples where the CNN model made wrong predictions
for i in range(min(9, len(cnn_wrong_indices))):
    idx = cnn_wrong_indices[i]
    plt.subplot(3, 3, i + 1)
    plt.imshow(test_images[idx].astype("uint8"))
    plt.title(f"CNN - Predicted: {class_names[cnn_predictions[idx]]}, Actual: {class_names[test_labels[idx]]}")
    plt.axis('off')
plt.tight_layout()
plt.show()
Conclusion: Looking at these images, the CNN model arguably should have classified several of them correctly, since the cats and dogs shown have reasonably distinct features.
# Show wrong predictions from the VGG16 model
vgg_wrong_indices = np.where(vgg_predictions != test_labels)[0]
# Set up a 3x3 grid to display images
plt.figure(figsize=(10, 10))
# Display some examples where the VGG16 model made wrong predictions
for i in range(min(9, len(vgg_wrong_indices))):
    idx = vgg_wrong_indices[i]
    plt.subplot(3, 3, i + 1)
    plt.imshow(test_images[idx].astype("uint8"))
    plt.title(f"VGG16 - Predicted: {class_names[vgg_predictions[idx]]}, Actual: {class_names[test_labels[idx]]}")
    plt.axis('off')
plt.tight_layout()
plt.show()
Conclusion: Looking at these images, we could say that the VGG16 model misidentified them due to low image quality, unusual poses, cats and dogs that genuinely look alike, and misleading context such as busy backgrounds or animals being held together.
Final Conclusion
In conclusion, both the custom CNN and the pre-trained VGG16 model were able to perform the image classification task, but the VGG16 model, which used transfer learning, produced clearly better results in terms of accuracy and generalization. The CNN model, although capable of reasonable performance, would need more computational resources and a larger amount of training data to match the transfer learning approach. This implies that, for similar image classification tasks, using pre-trained models can greatly cut down training time and improve model performance, particularly when the dataset is limited.